92 research outputs found

Prediction of scientific collaborations through multiplex interaction networks

Author: Aleta Alberto
Moreno Yamir
Paolotti Daniela
Starnini Michele
Tuninetti Marta
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2020

Link prediction algorithms can help to understand the structure and dynamics of scientific collaborations and the evolution of Science. However, available algorithms based on similarity between nodes of collaboration networks are bounded by the limited number of links present in these networks. In this work, we reduce this intrinsic limitation by generalizing the Adamic-Adar method to multiplex networks composed of an arbitrary number of layers that encode diverse forms of scientific interactions. We show that the new metric outperforms other single-layered, similarity-based scores and that scientific credit, represented by citations, and common interests, measured by the usage of common keywords, can be predictive of new collaborations. Our work paves the way for a deeper understanding of the dynamics driving scientific collaborations, and provides a new algorithm for link prediction in multiplex networks that can be applied to a plethora of systems.
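
For orientation, the classic single-layer Adamic-Adar score sums 1/log(degree) over the common neighbours of two nodes. The sketch below implements that score and then combines layers with a plain weighted sum. The weighted sum, the function names, and the toy edges are assumptions made for illustration; the abstract does not give the paper's multiplex formula, so this should not be read as the authors' metric.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def adamic_adar(adj, x, y):
    """Classic Adamic-Adar score of the pair (x, y) on a single layer."""
    common = adj.get(x, set()) & adj.get(y, set())
    # A common neighbour of two distinct nodes has degree >= 2, so log > 0;
    # the guard only protects against degenerate input such as x == y.
    return sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1)


def multiplex_adamic_adar(layers, weights, x, y):
    """ASSUMPTION: combine layers by a weighted sum of per-layer scores."""
    return sum(w * adamic_adar(adj, x, y) for adj, w in zip(layers, weights))


# Hypothetical toy data: one collaboration layer and one citation layer.
collaboration = build_adjacency([("a", "c"), ("b", "c"), ("c", "d")])
citation = build_adjacency([("a", "e"), ("b", "e")])

score = multiplex_adamic_adar([collaboration, citation], [1.0, 0.5], "a", "b")
print(round(score, 4))
```

Here the pair (a, b) has no direct link in either layer, yet it scores above zero because the two nodes share a neighbour in each layer; the layer weights control how much the citation evidence counts relative to co-authorship.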

arXiv.org e-Print Archive

UPCommons. Portal del coneixement obert de la UPC

Repositorio Universidad de Zaragoza

Towards a data-driven characterization of behavioral changes induced by the seasonal flu

Author: Gozzi Nicolo
Paolotti Daniela
Perra Nicola
Perrotta Daniela
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2020

In this work, we aim to determine the main factors driving self-initiated behavioral changes during the seasonal flu. To this end, we designed and deployed a questionnaire via Influweb, a Web platform for participatory surveillance in Italy, during the 2017-18 and 2018-19 seasons. We collected 599 surveys completed by 434 users. The data provide socio-demographic information, level of concerns about the flu, past experience with illnesses, and the type of behavioral changes voluntarily implemented by each participant. We describe each response with a set of features and divide them into three target categories. These describe those that report i) no (26%), ii) only moderate (36%), iii) significant (38%) changes in behaviors. In these settings, we adopt machine learning algorithms to investigate the extent to which target variables can be predicted by looking only at the set of features. Notably, 66% of the samples in the category describing more significant changes in behaviors are correctly classified through Gradient Boosted Trees. Furthermore, we investigate the importance of each feature in the classification task and uncover complex relationships between individuals’ characteristics and their attitude towards behavioral change. We find that intensity, recency of past illnesses, perceived susceptibility to and perceived severity of an infection are the most significant features in the classification task and are associated with significant changes in behaviors. Overall, the research contributes to the small set of empirical studies devoted to the data-driven characterization of behavioral changes induced by infectious disease
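
The figure of 66% of samples in one category being correctly classified corresponds to what is usually called per-class recall. A minimal sketch of that computation follows; the labels and predictions are made up for illustration and are not the study's data, and the function name is hypothetical.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Fraction of samples of each true class that were predicted correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


# Hypothetical labels and predictions (NOT the survey data).
y_true = ["none", "moderate", "significant", "significant", "significant", "none"]
y_pred = ["none", "significant", "significant", "significant", "moderate", "moderate"]

recall = per_class_recall(y_true, y_pred)
print(recall)
```

Reporting recall per class, rather than overall accuracy, makes it visible when a classifier does well on one category while performing poorly on the others.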

arXiv.org e-Print Archive

Greenwich Academic Literature Archive

Directory of Open Access Journals

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

Monitoring Gender Gaps via LinkedIn Advertising Estimates: the case study of Italy

Author: Bertè Margherita
Kalimeri Kyriaki
Paolotti Daniela
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/03/2023

Women remain underrepresented in the labour market. Although significant advancements are being made to increase female participation in the workforce, the gender gap is still far from being bridged. We contribute to the growing literature on gender inequalities in the labour market, evaluating the potential of the LinkedIn estimates to monitor the evolution of the gender gaps sustainably, complementing the official data sources. In particular, we assess the labour market patterns at a subnational level in Italy. Our findings show that the LinkedIn estimates accurately capture the gender disparities in Italy regarding sociodemographic attributes such as gender, age, geographic location, seniority, and industry category. At the same time, we assess data biases such as the digitalisation gap, which impacts the representativity of the workforce in an imbalanced manner, confirming that women are under-represented in Southern Italy. In addition to confirming the gender disparities reported in the official census, LinkedIn estimates are a valuable tool for providing dynamic insights; we show an immigration flow of highly skilled women, predominantly from the South. Digital surveillance of gender inequalities with detailed and timely data is particularly significant to enable policymakers to tailor impactful campaigns.

arXiv.org e-Print Archive

Developing Real Estate Automated Valuation Models by Learning from Heterogeneous Data Sources

Author: Bergadano Francesco
Bertilone Roberto
Paolotti Daniela
Ruffo Giancarlo
Publication venue: 'Penerbit UTM Press'
Publication date: 01/01/2021

In this paper we propose a data acquisition methodology, and a Machine Learning solution for the partially automated evaluation of real estate properties. The novelty and importance of the approach lies in two aspects: (1) when compared to Automated Valuation Models (AVMs) as available to real estate operators, it is highly adaptive and non-parametric, and integrates diverse data sources; (2) when compared to Machine Learning literature that has addressed real estate applications, it is more directly linked to the actual business processes of appraisal companies: in this context prices that are advertised online are normally not the most relevant source of information, while an appraisal document must be proposed by an expert and approved by a validator, possibly with the help of technological tools. We describe a case study using a set of 7988 appraisal documents for residential properties in Turin, Italy. Open data were also used, including location, nearby points of interest, comparable property prices, and the Italian revenue service area code. The observed mean error as measured on an independent test set was around 21 K€, for an average property value of about 190 K€. The AVM described here can help the stakeholders in this process (experts, appraisal company) to provide a reference price to be used by the expert, to allow the appraisal company to validate their evaluations in a faster and cheaper way, to help the expert in listing a set of comparable properties, that need to be included in the appraisal document

International Journal of Real Estate Studies

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

News and the city: understanding online press consumption patterns through mobile data

Author: Ferres Leo
Paolotti Daniela
Ruffo Giancarlo
Vilella Salvatore
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

The ever-increasing mobile connectivity affects every aspect of our daily lives, including how and when we keep ourselves informed and consult news media. By studying a DPI (deep packet inspection) dataset, provided by one of the major Chilean telecommunication companies, we investigate how different cohorts of the population of Santiago de Chile consume news media content through their smartphones. We find that some socio-demographic attributes are highly associated with specific news media consumption patterns. In particular, education and age play a significant role in shaping consumers' behaviour even in the digital context, in agreement with a large body of literature on off-line media distribution channels.

arXiv.org e-Print Archive

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Institutional Research Information System University of Turin

The Impact of Disinformation on a Controversial Debate on Social Media

Author: Paolotti Daniela
Ruffo Giancarlo
Semeraro Alfonso
Vilella Salvatore
Publication date: 30/06/2021

In this work we study how pervasive disinformation is in the Italian debate around immigration on Twitter and the role of automated accounts in the diffusion of such content. By characterising the Twitter users with an Untrustworthiness score, which tells us how frequently they engage with disinformation content, we are able to see that such bad information consumption habits are not equally distributed across the users; adopting a network analysis approach, we can identify communities characterised by a very high presence of users that frequently share content from unreliable news sources. Within this context, social bots tend to inject into the network more malicious content, which often remains confined in a limited number of clusters; instead, they target reliable content in order to diversify their reach. The evidence we gather suggests that, at least in this particular case study, there is a strong interplay between social bots and users engaging with unreliable content, influencing the diffusion of the latter across the network.
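
The abstract describes the score only as a measure of how frequently a user engages with disinformation. One simple reading, offered here purely as an assumption, is the share of a user's shared links that point to domains flagged as unreliable. The function name, the domain list, and the sample data below are all hypothetical, and the paper's actual definition may differ.

```python
def untrustworthiness(shared_domains, unreliable_domains):
    """ASSUMED definition: fraction of shared links pointing to unreliable domains."""
    if not shared_domains:
        # A user who shared nothing gets a score of 0.0 by convention here.
        return 0.0
    flagged = sum(1 for domain in shared_domains if domain in unreliable_domains)
    return flagged / len(shared_domains)


# Hypothetical data: placeholder domains, not real news sources.
unreliable = {"fake-news.example", "hoax.example"}
user_shares = ["daily.example", "fake-news.example", "hoax.example", "daily.example"]

score = untrustworthiness(user_shares, unreliable)
print(score)
```

A per-user ratio like this lies between 0 and 1, which makes it straightforward to compare users or to average the score within a network community.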

arXiv.org e-Print Archive

PubMed Central

Open Access Repository

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Institutional Research Information System University of Turin

Forecasting seasonal influenza fusing digital indicators and a mechanistic disease model

Author: Paolotti Daniela
Perra Nicola
Perrotta Daniela
Tizzoni Michele
Vespignani Alessandro
Zhang Qian
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/04/2017

The availability of novel digital data streams that can be used as a proxy for monitoring infectious disease incidence is ushering in a new era for real-time forecast approaches to disease spreading. Here, we propose the first seasonal influenza forecast framework based on a stochastic, spatially structured mechanistic model (individual level microsimulation) initialized with geo-localized microblogging data. The framework provides, for more than 600 census areas in the United States, Italy and Spain, the initial conditions for a stochastic epidemic computational model that generates an ensemble of forecasts for the main indicators of the epidemic season: peak time and intensity. We evaluate the forecasts' accuracy and reliability by comparing the results from our framework with the data from the official influenza surveillance systems in the US, Italy and Spain in the seasons 2014/15 and 2015/16. In all countries studied, the proposed framework provides reliable results with leads of up to 6 weeks that become more stable and accurate with progression of the season. The results for the United States have been generated in real-time in the context of the Centers for Disease Control and Prevention “Forecasting the Influenza Season Challenge”. A characteristic feature of the mechanistic modeling approach is the explicit estimate of key epidemiological parameters relevant for public health decision-making, which cannot be achieved with statistical models that do not consider the disease dynamics. Furthermore, the presented framework allows the fusion of multiple data streams in the initialization stage and can be enriched with census, weather and socioeconomic data.

Greenwich Academic Literature Archive

Developing Real Estate Automated Valuation Models by Learning from Heterogeneous Data Sources

Author: Bergadano Francesco
Bertilone Roberto
Paolotti Daniela
Ruffo Giancarlo
Publication venue: 'Penerbit UTM Press'
Publication date: 23/06/2021

International Journal of Real Estate Studies

Immigration as a Divisive Topic: Clusters and Content Diffusion in the Italian Twitter Debate

Author: Lai Mirko
Paolotti Daniela
Ruffo Giancarlo
Vilella Salvatore
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

In this work, we apply network science to analyse almost 6 M tweets about the debate around immigration in Italy, collected between 2018 and 2019, when many related events captured media outlets’ attention. Our aim was to better understand the dynamics underlying the interactions on social media on such a delicate and divisive topic, who the actors leading the discussion are, and whose messages have the highest chance to reach the majority of the accounts that are following the debate. The debate on Twitter is represented with networks; we provide a characterisation of the main clusters by looking at the highest in-degree nodes in each one and by analysing the text of the tweets of all the users. We find a strongly segregated network which shows an explicit interplay with the Italian political and social landscape, but which seems to be disconnected from the actual geographical distribution and relocation of migrants. In addition, quite surprisingly, the influencers and political leaders that apparently lead the debate do not necessarily belong to the clusters that include the majority of nodes: we find evidence of the existence of a ‘silent majority’ that is more connected to accounts that express a more positive stance toward migrants, while leaders whose stance is negative apparently attract more attention. Finally, we see that the community structure clearly affects the diffusion of content (URLs) by identifying the presence of both local and global trends of diffusion, and that communities tend to display segregation regardless of their political and cultural background. In particular, we observe that messages that spread widely in the two largest clusters, whose most popular members are also notoriously on opposite sides of the political spectrum, have a very low chance to get visibility in other clusters.

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Institutional Research Information System University of Turin